Abstract



Patterns of animate and inanimate systems show remarkable similarities in their aggregation. One similarity 
is the double-Pareto distribution of the aggregate-size of system components. Different models have been 
developed to predict aggregates of system components. However, not many models have been developed 
to describe probabilistically the aggregate-size distribution of any system regardless of the intrinsic and 
extrinsic drivers of the aggregation process. Here we consider natural animate systems, from one of the
largest mammals - the African elephant (Loxodonta africana) - to the Escherichia coli bacteria, and natural
inanimate systems in river basins. Considering aggregates as islands and their perimeter as a curve mirroring
the sculpting network of the system, the probability of exceedence of the drainage area and the Hack's law
are shown to be the Korcak's law and the perimeter-area relationship for river basins. The perimeter-area
relationship and the probability of exceedence of the aggregate-size provide a meaningful estimate of the
same fractal dimension. Systems aggregate because of the influence exerted by a physical or process network
within the system domain. The aggregate-size distribution is accurately derived using the null-method of 
box-counting on the occurrences of system components. The importance of the aggregate-size spectrum relies 
on its ability to reveal system form, function, and dynamics also as a function of other coupled systems. 
Variations of the fractal dimension and of the aggregate-size distribution are related to changes of systems 
that are meaningful to monitor because they are potentially critical for these systems.

1. Introduction

The understanding of the causes underlying the spatial 
organization of species in ecosystems is one of the 
most challenging and debated topics in ecology. This 
is also true for the spatial organization of components 
of other systems, such as inanimate natural systems. 
This is for example the case of river basins. As for 
human-made systems, the assemblage of these systems 
is mostly determined by human design; however, the 
human component dynamics makes these systems not 
completely deterministic. This is for example the case 
of cities. Questions arise about the level of complexity 
of theories and models needed to reproduce and characterize
the aggregation of system components. Here a system
component is broadly defined as the elementary unit
that forms animate (biotic) or inanimate (abiotic)
aggregates indistinctly. For example, individuals of the
same species form aggregates in an ecosystem and the
whole metapopulation is defined by the whole set of 
aggregates. In living or animate systems, aggregates of 
systems components are observed from the microscale 
(for example, bacteria [1, 2]), the meso/macroscale 
(for example, cancer cells [3], and ants [4]), to the 
continental scale (for example, trees [5-9], fishes [9], 
African elephant [10], and corals [11]). In non-living 
or inanimate systems aggregates are observed as well 
at different scales [12]. Concepts developed for animate 
system components were generalized to inanimate 
system components. For instance, as in [13-15],
river basins can be considered as living systems
(organisms) characterizable by a metabolism
(proportional to the evapotranspiration and the
drainage area) and a body-mass (proportional to
the area of connected tributaries). Each subbasin is
formed by channel pixels and hillslope pixels that
are both system components. Analogies to
organisms have also been formulated for cities [16]. 
This analogy between animate and inanimate systems 
allows a potentially mutual understanding of systems, 
and the ability to adopt similar probabilistic methods 
to characterize both systems. It is certainly difficult to 
claim universal principles of organization of systems; 
however, the development of methods to characterize 
both animate and inanimate systems is certainly useful 
for monitoring these systems and for system design. 

One of the most fundamental variables characterizing
animate and inanimate systems is the aggregate-size.
The aggregate-size (also called patch-size or cluster-size
in ecology) is defined as the area of the landscape in
which individuals of systems are aggregated together
[17, 18]. In ecological modeling the definition of
aggregates for species is generally a non-trivial task 
that requires the definition of many biological variables. 
These local system variables are the occurrence of 
species, the minimum area to support a population, 
the habitat quality, and the sex-structure of species 
to list just a few. The aggregates of species are 
generally the input of metapopulation models for 
determining species abundance by considering the 
stochastic dynamics of birth-death and dispersal in 
and among metapopulation aggregates. The stochastic 
dispersal occurs on a network that connects species 
aggregates. The occurrence of system components is 
certainly one of the most important variables in both 
defining the aggregate-size and for the inference of 
system dynamics. For example, in ecosystems the 
location of species occurrences is also useful to estimate 
the abundance of species [19], the relationships 
between species and environmental variables for the 
definition of niches and climate change effects on 
species [20], and the interactions with other species 
[21]. 

The importance of the aggregate-size relies also on 
its distribution within the system analyzed and the 
variation of this distribution in time. The probabilistic 
structure, and more precisely the distribution of 
the aggregate-size of systems components is widely 
reported to be a power-law. This is particularly the 
case for single and multiple species considered together. 
However, exponential probability distributions of the
aggregate-size are observed for some species [22-24],
for perturbed and evolving ecosystems (for example,
for vegetation due to grazing [25, 26]), and for
ecosystems characterized by strong gradients of some
environmental variables (for example, for vegetation in
the Kalahari rainfall transect in Africa [27]). Power-laws
of the aggregate-size for inanimate systems (natural
and man-made), such as river basins, landslides,
snow-cover in landscapes [28], and cities [29], are also
observed. Even for inanimate systems deviations from
the power-law distribution are observed: for example,
exponential aggregate-size distributions are reported
for cities subjected to rapid urbanization [30], and
log-normal distributions are observed for submarine
landslides [31]. A substantial part of the literature
investigates the origin of the power-law distribution
of aggregates and the causes of deviations from
this distribution [26]. Here we confine our interest
to animate and inanimate natural systems whose
distribution of the aggregate-size is a power-law, which
seems to occur in the majority of cases in which
ecosystems are at a stationary state in their evolution [32],
or around a stable state in the energy landscape [26].
One of the explanations for the power-law distribution 
of the aggregate-size provided by literature is related 
to the typology of species movement in ecosystems. 
Individuals of species, from bacteria to elephants, seem
to follow a simple Brownian [33] or a Brownian-Levy
movement [34-36]. This typology of movement
arises from an optimal foraging strategy determined, for
instance, by an optimization of the species dispersal for
survival [22, 37-39] constrained by the topology and
boundaries of the environment (for example, the river
network and basin ridges in a river basin, or a Petri
dish for in vitro bacteria populations). This type of
dispersal is generally simulated with a combination of
exponential and "heavy-tailed" dispersal kernels [39]
that results in a scale-free distribution of the aggregate
patches of the species simulated [9, 40].

Aggregates of species are linked together by physical
networks (for instance, river networks) [9] or
process-networks [41] (for example, communication
networks and dispersal networks that describe the
communication and movement of individuals,
respectively) from which the observed patterns arise.
Process-networks can be embedded into physical networks (for
example, dispersal networks of fishes are constrained
within the river network [42]) or can exist without
a visible physical network in the system space (for
instance, the communication network among bacteria
colonies or among transceivers, considering animate
and inanimate systems respectively). Many detailed
processes are responsible for the formation of
aggregates, and many models have been developed in the
literature to predict aggregates of species. Some models try
to mimic the fine-level details of ecological processes,
such as species interactions (for example, conspecific
attractions [43]) and feedbacks (for example,
density-dependence) among species. Other non-physically-based
models are built around "macroscopic" theories
that consider the ensemble average behavior of system
components, such as the theory of self-organized
criticality [44-47], allelomimesis [48], preferential
attachment [41], metabolic optimization [49, 50], percolation
[51, 52], habitat suitability, and the neutral theory
of biodiversity [9, 42, 53-55]. Network-based models
were developed on these theories to reproduce
patterns of aggregation of complex systems. The network
framework is a simplified and valuable framework
that captures average properties of system
dynamics and organization. An example is the theory
of optimal channel networks (OCNs) [56], which, without
the inclusion of geomorphological details, is capable
of describing analytically the topological properties of
river networks. Aggregation phenomena of species were
successfully modeled using the framework of OCNs
or other network-based models. While network-based
models were developed to reproduce animate and
inanimate processes, not many network-based models were
developed to probabilistically describe patterns of these
processes created using data or model predictions. At
the same time, no substantial advancement occurred in
the development of analytical forms for the probability
distribution of the aggregate-size. In the literature different
analytical forms are found for different types of
power-law distribution of the aggregate-size. The purpose of
this study is (i) to provide insights into a parsimonious
model (box-counting [57]) for assessing aggregates of
systems just using occurrences of system components,
(ii) to integrate theories of aggregate-size distributions
(the Korcak's law [58], the perimeter-area relationship,
and the theory of fractal river basins [56]) for animate
and inanimate systems and test the validity of this
integration on all the systems analyzed, and (iii) to
formulate a generalized analytical form for all the types
of power-law probability distributions of the
aggregate-size. We particularly focus on systems that exhibit a
power-law spectrum of the aggregate-size.

The paper is organized as follows. Section 2 describes 
the assumptions, the theoretical framework, the data, 
and the models used to predict the aggregates of 
systems components. Section 3 reports the results of 
the box-counting method, of other models, and the 
validation of the theory. The discussion of the results 
is in Section 4. Section 5 lists the most important
conclusions, perspectives for future research, and 
potential applications of our findings. 

2. Materials and Methods 

2.1. Korcak's law and Aggregation Hypothesis

Studies about the prediction of aggregates of systems
and their theoretical characterization were developed
separately among scientific disciplines; this is because
aggregation phenomena occur in a broad variety of
systems of different nature. In geography it is well
known that the size of islands follows a power-law
probability distribution, $P(S > s) \sim s^{-b}$, in which the
exponent of the exceedence probability distribution
is related to the fractal dimension of the coast of
the islands [58]. This power-law probability is called
"Korcak's law". The exponent is half of the fractal
dimension of the island coastline ($b = D/2$) [59].
Mandelbrot [59] found that the average value of this
exponent is 0.65, with variations from 0.5 for African
islands to 0.75 for Indonesian islands; thus, $D$ is within
the range [1; 1.5].
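
As an illustration of how the Korcak exponent and the implied fractal dimension can be estimated from a list of sizes, the following minimal sketch draws synthetic "island" sizes from a pure power-law and recovers $b$ with the standard maximum-likelihood (Hill) estimator. The function names, the lower cutoff `s_min`, and the synthetic data are assumptions made for this example only; they are not the data or the code used in this study.

```python
import math
import random

def sample_pareto_sizes(n, b, s_min=1.0, seed=0):
    """Draw n sizes with P(S > s) = (s / s_min)**(-b) by inverse-transform sampling."""
    rng = random.Random(seed)
    # 1 - u lies in (0, 1], so every sampled size is >= s_min.
    return [s_min * (1.0 - rng.random()) ** (-1.0 / b) for _ in range(n)]

def korcak_exponent_mle(sizes, s_min=1.0):
    """Maximum-likelihood (Hill) estimate of b for sizes >= s_min."""
    logs = [math.log(s / s_min) for s in sizes if s >= s_min]
    return len(logs) / sum(logs)

# Hypothetical example: b = 0.65 is the average value quoted in the text.
sizes = sample_pareto_sizes(20000, b=0.65)
b_hat = korcak_exponent_mle(sizes)
d_hat = 2.0 * b_hat  # fractal dimension implied by b = D / 2
print(f"b = {b_hat:.3f}, D = {d_hat:.3f}")
```

The estimator assumes a single power-law regime above `s_min`; a distribution with two scaling regimes would need to be fitted separately on each side of the truncation point.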

Landscape ecology is the field in which the theory
of aggregates received the highest attention due to the
importance of the species aggregate-size distribution
for species conservation. [60] and [61] studied the
patches of river ecosystem properties (for instance,
slope, hydrogeology, erosion, and vegetation) in New 
Zealand. These are the first studies, to the best of 
our knowledge, that tried to unify theories, including 
the "Korcak's law", for the characterization of patches 
in heterogeneous ecosystems. However, the fractal 
exponents derived in these studies were considered 
as independent estimates of features of individual 
aggregates and of the whole mosaic of aggregates. 
Moreover, these studies did not correlate any network 
of the ecosystem analyzed to the pattern of aggregates 
of system components. 

In geomorphology the aggregation of subbasins 
around river networks was elegantly investigated by 
many studies, starting from [62], to the comprehensive 
review of [56] in which the theory of optimal channel 
networks was proposed. In ecology, [9] analyzed the 
aggregation of species in river networks and 2-D 
landscapes, considering the exponent of the exceedence 
probability distribution of the aggregate-size as a 
meaningful indicator of the collective organization of 
species. That exponent was considered as a function 
of geometrical and environmental constraints of the 
ecosystem where aggregates form. 

In this paper the following hypotheses have been 
tested. 

1. It is possible to predict the aggregates of animate 
and inanimate systems and the aggregate-size 
spectrum solely from the occurrences of systems 
components. In the case of inanimate systems we 
consider the center of mass of each aggregate as 
an occurrence of system component. The box- 
counting method on the occurrences of systems 
components is a reliable method for calculating 
the aggregate-size. The accuracy of the box- 
counting in the aggregate and aggregate-size 
predictions is assessed with respect to other 
methods based on prediction of aggregates' area 
and perimeter. 

2. The theory of fractal river networks can be
generalized in order to characterize the probabilistic
structure of the aggregate-size of other systems
arranged along river networks (for example,
landslides as inanimate systems, and fishes and trees
as animate systems), and arranged along
non-visible networks of system processes occurring in
landscapes (for example, dispersal networks). In
any system a meaningful self-affine or self-similar
network can be traced, and statistical properties
of aggregates can be sampled along the network,
as in [63] for the subbasin drainage area. In
fact, the coalescence of system components can
be described as an aggregation phenomenon along
branching trees [64, 65]. We consider the Korcak's
law [58], that is, the power-law probability
distribution for the size of islands, as the generic
probability distribution for the aggregate-size of
any animate and inanimate system components.
Thus, by analogy, aggregates (such as subbasins and species
aggregates) are considered as islands and system
networks (such as river networks and dispersal
networks) as coastlines. Aggregates and
aggregates' boundaries are sculpted by network
processes. $S$, $C$, and $L_{\parallel}$ are defined as the size,
perimeter, and diameter of aggregates
respectively, and $l$ as the length of the aggregates'
sculpting network (Figure 1). We tested the analogy
between the theory of fractal river basins,
the Korcak's law, and the perimeter-area
relationship by assessing the fractal dimension of
aggregate patterns for these three conceptual models.
We hypothesize that these models estimate a unique
fractal dimension. This
fractal dimension is a representative indicator for
the whole set of system aggregates and of each
aggregate on average.

3. A novel analytical formulation for the
double-Pareto probability distribution (or spectrum) of
the aggregate-size of system components is
proposed. Such a distribution can fit the Pareto
distributions that animate and inanimate system
components show.

2.2. Systems Data 

Animate and inanimate natural systems are considered
for a wide range of body-mass of system components,
climatological conditions, and biological dynamics in
order to verify the validity of the proposed probabilistic
description. Available data from our current and past
research studies allow us to consider animate and
inanimate systems at different spatial scales (Figure
2) that exhibit a power-law probability distribution
of the aggregate-size of system components. These
systems are the E. coli bacteria in nutrient-rich substrate
(courtesy of [2]), the subbasins of a portion of the
Tanaro basin (Italy) [63], the Snowy Plover in Florida
in 2006 [66], the historical landslides of the Arno river
basin (Italy) [67], the African elephant in the Kruger
National Park (KNP) in 2006 (South Africa) [68], and
fish and tree species in the Mississippi-Missouri River
System (MMRS) (USA) [9, 42, 55]. As for the trees of
the MMRS we consider only big trees for which the
diameter at breast height is larger than or equal to five
inches [9].

Figure 2 shows the occurrences of the aforementioned 
systems in order of the extension of the system domain 
where they occur. The extension of these systems covers
fifteen orders of magnitude from the Petri dish of the E.
coli ($6.1 \times 10^{-9}$ km$^2$), the Tanaro basin ($5.3 \times 10^{2}$ km$^2$),
the beach habitat along the Gulf coast of Florida
($\sim 5.6 \times 10^{2}$ km$^2$), the Arno basin ($8.23 \times 10^{3}$ km$^2$), the
KNP ($19.0 \times 10^{3}$ km$^2$), to the MMRS ($2.98 \times 10^{6}$ km$^2$).
For the African elephant, the Snowy Plover, and the 
E. coli we evaluate one pattern of occurrences as a 
realization of a process in which aggregation always 
occurs [69, 70]. For the Snowy Plover occurrences are 
available from 2002 to 2011 obtained by field survey 
[66], and for the African elephant from 1985 to 2004 for 
the dry season obtained by plane survey. We anticipate 
that a temporal analysis of the occurrence patterns is 
the subject of forthcoming papers. Here we examine 
the years for which the reliability of the occurrence 
patterns is the highest, in terms of data quantity and 
data quality. Other yearly-sampled occurrences of both 
African elephant and Snowy Plover show the formation 
of very similar aggregates. This is the case also of the E. 
coli bacteria in which self-similar patterns are observed 
in Petri dishes [71] for different values of the nutrient 
concentration. 

2.3. Box-counting 

The first step of the box-counting method is the creation 
of a coarse grid of boxes to overlay on the top of the 
system domain analyzed. The grid is then refined at 
each step until the lower cutoff of the analysis. The box- 
counting technique [57] leads to a scaling relationship 
between the number of boxes ($N(r)$) in which at least
one occurrence of system components is contained and
the length of the side of the box ($r$). The relationship is
a power-law, $N(r) \sim r^{-D_b}$, where $D_b$ is the
Minkowski-Bouligand dimension, which is a good estimate of the
fractal dimension (or Hausdorff dimension) of the
point-pattern of occurrences analyzed. The
box-counting technique is applied to all the point-patterns
of Figure 2 for at least 2 16 orders of magnitude of r. The 
box-counting is illustrated in Figure 2 (c) for the Snowy 
Plover occurrences. Variabilities of measured exponents
($D_b$) for different systems are expressed as standard
errors found by a Maximum Likelihood Estimation
method (Section 2.8), bootstrapping over cases and
deriving exponents using the linear and the jackknife
models [72]. For the river basin, the landslides, and the
E. coli colony patterns the center of mass of each system
component (i.e., a subbasin, a landslide, and an E. coli
colony respectively) is considered in the box-counting
analysis. In Figure 1 the centers of mass of ideal
aggregates are shown as grey dots. For the systems with
occurrence data of system components available (i.e.,
big trees and fishes of the MMRS, African elephant
in the KNP, and Snowy Plover (SP) along the Gulf coast of Florida),
the point-patterns of occurrences are directly analyzed
without any pre-processing.
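
The box-counting procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in this study: occurrences are given as planar (x, y) coordinates, the grid is anchored at the origin, the box sides are chosen by hand, and $D_b$ is taken as the ordinary least-squares slope of $\log N(r)$ against $\log(1/r)$. The function names are hypothetical.

```python
import math
import random

def box_count(points, r):
    """Number of boxes of side r that contain at least one occurrence."""
    return len({(math.floor(x / r), math.floor(y / r)) for x, y in points})

def box_counting_dimension(points, sides):
    """Least-squares slope of log N(r) against log(1/r), an estimate of D_b."""
    xs = [math.log(1.0 / r) for r in sides]
    ys = [math.log(box_count(points, r)) for r in sides]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic check: occurrences filling the unit square uniformly,
# for which the estimate is expected to be close to 2.
rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(50000)]
sides = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
d_b = box_counting_dimension(points, sides)
print(f"D_b = {d_b:.3f}")
```

For real occurrence patterns the range of box sides matters: boxes smaller than the typical spacing between occurrences each hold a single point, so the count saturates at the number of occurrences and the slope flattens, which is why a lower cutoff is needed.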



2.4. Models of Aggregate Prediction 

Aggregates of systems considered in this paper are
predicted by models based on different assumptions
and hypotheses, and at different levels of complexity. In the
following we give a brief explanation of the models. We
refer the reader to the papers in which each model was
implemented for more details. The area of an aggregate
is defined as the sum of adjacent pixels considering the
Von Neumann neighboring criterion. The perimeter of
an aggregate is defined by the sum of the sides of the
external pixels composing the aggregate.
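
The area and perimeter definitions above can be illustrated with a small sketch that labels 4-connected (Von Neumann) aggregates on a binary raster. It is written under two assumptions that the text does not spell out: the perimeter counts every pixel side that faces an empty pixel or the edge of the grid (so sides bordering internal holes are included), and both quantities are expressed in pixel units. The function name and the toy grid are hypothetical.

```python
from collections import deque

def aggregates_area_perimeter(grid):
    """Label 4-connected (Von Neumann) aggregates of truthy pixels.

    Returns a list of (area, perimeter) tuples in pixel units; the perimeter
    counts pixel sides that face an empty pixel or the edge of the grid.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    results = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not grid[r0][c0] or seen[r0][c0]:
                continue
            area, perimeter = 0, 0
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:  # breadth-first flood fill of one aggregate
                r, c = queue.popleft()
                area += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if not inside or not grid[nr][nc]:
                        perimeter += 1  # exposed side
                    elif not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            results.append((area, perimeter))
    return results

grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
stats = aggregates_area_perimeter(grid)
print(stats)
```

In this toy grid the 2 x 2 block has area 4 and perimeter 8, and the vertical pair has area 2 and perimeter 6. Diagonal contact does not merge aggregates under the Von Neumann criterion.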

For river basins, landslides, and E. coli colonies
the aggregates are extracted by an image analysis
model. The observed E. coli pattern (courtesy of [2])
(Figure 2, a) is binarized by extracting pixels whose
grayscale value is higher than 30 (white pixels are
logical "true"). This threshold reproduces the
observed patterns with an accuracy of 92 %. The area
and perimeter are calculated for all the aggregates
extracted using the grayscale threshold criterion. The
code for extracting and calculating the aggregates was
developed by the first author using Matlab [73].

The subbasins of the Tanaro basin in Figure 2 (b) are 
derived in [63] by extracting the river network from the 
digital elevation model (DEM). The network extraction 
is based on the identification of the contributing areas 
for each stream of the network. The extraction of the
network and other hydrogeomorphological analyses are
performed using the free software HydroloGis [74]. As
for the landslides in the Arno basin, the more than 27,500
recorded landslide occurrences were identified in [67]
using aerial-photo interpretation, expert knowledge,
and remote sensing techniques. Details are explained in
[67].

For the Snowy Plover (SP) (Figure 3, b) a habitat suitability
model coupled with a patch delineation model is
used to determine the aggregates of species [75].
[75] defined a shorebird aggregate as an aggregate of
pixels whose habitat suitability index is higher than a
certain threshold, large enough to support collectively a
meaningful population size (at least a breeding pair),
and close enough to support
breeding and wintering activity. The habitat suitability
index is based on habitat suitability maps predicted
by a maximum entropy model [76, 77] constrained
on environmental variables. The closeness of pixels
is evaluated by a neighborhood distance that is a
proxy of the average home-range dispersal distance.
The dispersal distance for mammals has, in fact, been shown
to be proportional to the home-range size [78]. Pixels
whose mutual distance is lower than the neighborhood
distance are part of the same aggregate.
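
The neighborhood-distance rule can be sketched as a single-linkage grouping of suitable pixels. This is an illustration only, and it rests on an assumption about the rule: pixels are joined transitively, so two pixels farther apart than the neighborhood distance still share an aggregate when a chain of closer pixels connects them. The patch delineation model of [75] may differ in its details. The function name, the pixel coordinates, and the value of `max_dist` are hypothetical.

```python
import math

def group_by_neighborhood_distance(pixels, max_dist):
    """Single-linkage grouping: pixels closer than max_dist share an aggregate.

    pixels is a list of (x, y) centres; returns one aggregate label per pixel.
    """
    parent = list(range(len(pixels)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Pairwise comparison is quadratic in the number of pixels;
    # adequate for a sketch, not for large rasters.
    for i in range(len(pixels)):
        for j in range(i + 1, len(pixels)):
            if math.dist(pixels[i], pixels[j]) < max_dist:
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(len(pixels))]
    relabel = {root: k for k, root in enumerate(dict.fromkeys(roots))}
    return [relabel[root] for root in roots]

# Hypothetical centres of pixels whose suitability index exceeds the threshold.
suitable = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
labels = group_by_neighborhood_distance(suitable, max_dist=1.5)
print(labels)
```

Here the first three pixels form one aggregate through the middle pixel, and the last two form a second aggregate.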

As for fish and tree species in the MMRS (Figure 
3, c and d respectively) [9] determined the aggregate- 
size spectrum of species by implementing a neutral 
metacommunity model (NMM) proposed by [42, 79], 
and further improved by [55]. The predicted aggregate-size
spectrum matches the spectrum calculated using
data of species occurrences. The NMM is a stochastic 
speciation-dispersal model based on the individual per- 
capita species equivalence assumption. The neutral 
hypothesis [53] holds for the same taxonomic group. 
An aggregate of fish and tree species is defined 
as the number of contiguous local communities (a 
local community is a "direct tributary area" [9]) in 
which a species occurs along the network or according
to a Von Neumann neighboring criterion in a 2D
domain respectively [9]. For fish and tree species
of the MMRS the aggregates of each species that
are assumed equivalent to each other are considered 
together in determining the aggregate-size spectrum. 
Thus, the aggregate-size spectrum is representing a 
metacommunity pattern of species diversity rather than 
of single metapopulations. 

For the African elephant the size and perimeter 
of elephant aggregates are computed considering the 
adjacent boxes of the box-counting method (Section 
2.3) at a biologically relevant resolution of grid. We 
consider as aggregates the boxes whose unitary side
length is 38 km, which is approximately the square root of
the dry-season home range. For the African elephant in the KNP the home
range varies from 400 to 1500 km$^2$ in the wet (summer)
and in the dry (winter) season respectively [10, 80-82].
The choice of the unitary side length for the
definition of the aggregates has a very limited influence
on the aggregate-size distribution for scale-free patterns,
which is also the case of the African elephant [70].
Unfortunately, for the African elephant we do not have 
any information about the observed aggregates and 
the only data available are part of an ongoing project 
in which a stochastic network-based metapopulation 
model is implemented [70] using only occurrences and 
habitat capacity functions without the requirement of 
calculating aggregates.

2.5. Theoretical Construct

The theoretical characterization of aggregates is based
on hypotheses about relationships among aggregate
geometrical features. Identical relationships have been
formulated for river basins by [83]. The allometric
ansatz for the size $S$ and the perimeter $C$ of the
aggregates is:

$$ S = k_s L_{\parallel}^{D_s}, \qquad C = k_c L_{\parallel}^{D_c} \tag{1} $$

where: $L_{\parallel}$ is the main diameter of aggregates, which
is a proxy of the aggregates' characteristic length and
is measured along the principal axis of inertia of
aggregates (Figure 1 and Figure 2); $L_{\perp}$ is the transversal
diameter of aggregates; $D_s = 1 + H$ because $S \sim L_{\parallel} L_{\perp}$
and $L_{\perp} \sim L_{\parallel}^{H}$, where $H$ is the Hurst exponent; and $D_c$ is
$d_l$ according to the theory of fractal river basins [56].
The $d_l$ exponent, which characterizes the
length of the aggregate characteristic curve, is the
fractal dimension of a stream for fractal river networks.
A stream, that is, each rivulet going from any site of
the basin to the outlet, is a fractal set with the same
fractal dimension along its path. In general, $D_s$ and
$D_c$ are fractal dimensions related to the morphological
structure of aggregates.

The ansatz is to consider that half of the perimeter
($C/2$) scales with a power of one with $l$, which is
the mainstream length in river basins (Figure 1).
In general $l$ is definable as the length of the
aggregate characteristic curve. The aforementioned
scaling relationship is verified for river basins. For
river basins it was suggested that basin boundaries and
mainstream courses are in essence mirror images of
each other [61, 84-87]. This assumption generates the
second allometric law in Equation 1 irrespective of
the constant. We assume the relationship to hold for any
aggregate along a line drawn into the domain on which 
the aggregates are self-organized (Figure 2) and within 
any aggregate (Figure 1). The characteristic curve can be 
the mainstream of a river basin, a rivulet of a subbasin, 
or any other characteristic curve drawn within the 
system domain. S can be imagined as the body-mass of 
a system component in a biological perspective as for 
river basins in river systems [14]. 

From Equation 1, by incorporating the first
relationship into the second one, the perimeter-area
relationship (PAR) [88] is derived as:

$$ C \sim S^{h} \tag{2} $$

where $h = d_l/(1 + H) = D_c/2$ is the Hack's exponent.
$h = D_c/D_s$ by considering the ansatz in Equation 1. In
the ecology literature $D_c$ is classically identified as the
fractal dimension of an aggregate derived from the PAR.
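
The substitution that leads to the PAR can be written out step by step. The following worked sketch assumes that the two allometric relationships of Equation 1 take the form $S = k_s L_{\parallel}^{D_s}$ and $C = k_c L_{\parallel}^{D_c}$, with $D_s = 1 + H$ and $D_c = d_l$ as defined in the text.

```latex
\begin{align*}
S = k_s L_{\parallel}^{D_s}
  \;&\Rightarrow\; L_{\parallel} = \left( S / k_s \right)^{1/D_s}, \\
C = k_c L_{\parallel}^{D_c}
  \;&\Rightarrow\; C = k_c \, k_s^{-D_c/D_s} \, S^{D_c/D_s} \sim S^{h},
  \qquad h = \frac{D_c}{D_s} = \frac{d_l}{1 + H}.
\end{align*}
```

Under these assumptions the exponent reduces to $h = D_c/2$ in the self-similar case $H = 1$, for which $D_s = 2$.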



Equation 2 is commonly known as Hack's law in fluvial
geomorphology [89], where $C$ is the mainstream length
and $S$ is the size of a basin. The interchange of the
mainstream with the basin boundary is supported by
our ansatz and by the empirical evidence of the scaling
of the basin perimeter with the mainstream length
with a power of one ($C \sim l$) [61, 84-87]. The validity of
the Hack's law is proven for any subbasin embedded
within river basins. This shows the self-affinity of river
basins and the possibility to extend this law to any
system. Here we test the validity of Equation 2 also for
any aggregate of the animate and inanimate systems
considered.

The probability density function of the aggregate-size
can be universally described by the double-Pareto
distribution:

$$ p(s) = \left[ \frac{t^{\beta}}{\beta} - \frac{m^{-\epsilon} - t^{-\epsilon}}{\epsilon} \right]^{-1} \left\{ s^{\beta - 1} \, \Theta(s) \, \Theta(t - s) + s^{-\epsilon - 1} \, \Theta(m - s) \, \Theta(s - t) \right\} \tag{3} $$

where the first term applies for $s < t$ and the second for $s > t$;
$t$ is the truncation point ("hard truncation")
where a change of scaling can occur, and $m$ is the
upper cutoff corresponding to the maximum value of
the aggregate-size (Figure 3, e). $f(x)$ is a function such
that $f(x) = 1$ if $x \ll 1$ and $f(x) = 0$ if $x \gg 1$. Here $f(x) =
\Theta(1 - s/m)$. $\beta$ and $\epsilon$ are the scaling exponents of the
aggregate-size spectrum. The double-Pareto probability
density function (pdf) of the aggregate-size has been
widely studied [90], for example for landslides [91-93].
Here we propose the novel analytical formulation
in Equation 3 and we verify if such distribution is
reproduced by the box-counting method on data versus
model predictions. The probability of exceedence of the
aggregate-size is by integration of Equation 3:

$$ P(S \geq s) = \begin{cases} N \left[ \dfrac{t^{\beta} - s^{\beta}}{\beta} - \dfrac{m^{-\epsilon} - t^{-\epsilon}}{\epsilon} \right] = C_0 - C_1 s^{\beta} & \text{for } s < t \\[2ex] \dfrac{N}{\epsilon} \left( s^{-\epsilon} - m^{-\epsilon} \right) \sim s^{-\epsilon} F(s/m) & \text{for } s > t \end{cases} \tag{4} $$

where $N = \left[ t^{\beta}/\beta - (m^{-\epsilon} - t^{-\epsilon})/\epsilon \right]^{-1}$, $C_0$ and $C_1$ are constants,
$F$ is a homogeneity function that depends on a
characteristic size of aggregates $m \sim S \sim L_{\parallel}^{1+H}$, and $\epsilon =
D_K/2$ [59]. $D_K$ is the fractal dimension of aggregates.
Thus, Equation 4 is a novel formulation of the Korcak's
law [58] that allows a double scaling regime of the
aggregate-size distribution. The distribution is tested
against the aggregate spectra predicted by the
box-counting and by the models (Section 2.3 and 2.4
respectively). The fit of the distribution is evaluated






ICST Transactions Preprint 



Power-law of Aggregate-size Spectra 



by a Maximum Likelihood Estimation (MLE) approach 
(Section 2.8). The determination of D K is independent 
from the PAR because it considers only the aggregate- 
size. In the theory of fractal river-basins [56] a subbasin 
is a unit of a river basin system subjected to geological 
and climatological forces. The random variable s is the 
drainage area of a subbasin in river basin ecosystems. 
The interaction among subbasins happens along the 
drainage ridges that divide the runoff among adjacent 
subbasin hillslopes. 
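
The double-Pareto density and its exceedance probability can be explored numerically with a short sketch. It assumes one particular reading of Equation 3: a hard-truncated density proportional to $s^{\beta-1}$ for $0 < s < t$ and to $s^{-\epsilon-1}$ for $t \leq s \leq m$, with a single normalization constant. The parameter names `beta` and `eps`, the function name, and the numerical values are chosen for illustration only and are not fitted to any of the systems analyzed.

```python
def double_pareto(beta, eps, t, m):
    """Return (pdf, exceedance) for a hard-truncated double-Pareto law.

    Assumed form: p(s) = N * s**(beta - 1) for 0 < s < t and
    p(s) = N * s**(-eps - 1) for t <= s <= m, zero elsewhere.
    """
    upper_mass = (t ** -eps - m ** -eps) / eps  # integral of s**(-eps-1) over [t, m]
    lower_mass = t ** beta / beta               # integral of s**(beta-1) over [0, t]
    norm = 1.0 / (lower_mass + upper_mass)

    def pdf(s):
        if 0 < s < t:
            return norm * s ** (beta - 1)
        if t <= s <= m:
            return norm * s ** (-eps - 1)
        return 0.0

    def exceedance(s):
        """P(S >= s), obtained by integrating the pdf from s to m."""
        if s <= 0:
            return 1.0
        if s < t:
            return norm * ((t ** beta - s ** beta) / beta + upper_mass)
        if s <= m:
            return norm * (s ** -eps - m ** -eps) / eps
        return 0.0

    return pdf, exceedance

# Hypothetical parameters: eps = 0.65 corresponds to D_K = 1.3.
pdf, exceedance = double_pareto(beta=1.0, eps=0.65, t=1.0, m=1000.0)
for s in (0.5, 1.0, 10.0, 100.0):
    print(s, round(exceedance(s), 4))
```

Well below the upper cutoff the exceedance probability behaves as $s^{-\epsilon}$, and it bends down to zero as $s$ approaches $m$, which is the finite-size effect carried by the cutoff in this assumed form.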

2.6. Sampling of Aggregate Areas 

For aggregates of systems, we consider the aggregate 
areas that are distributed in the system along a real 
or an ideal curved line and along a perfectly straight 
line in the system domain. The former case is the self- 
affine case, and the latter case is the self-similar case 
of aggregates for which H < 1 and H = 1 respectively. 
The theoretical characterization of the distribution of 
areas was performed by [63] for subbasins organized 
along a fractal mainstream, and along a perfectly 
straight mainstream. Similarly, [94] considered the case 
of subbasins along a fractal coastline showing indirectly 
the generality of the theoretical characterization of the 
area of river basin aggregates. 

The subbasin area contributes to the formation of 
the drainage area. The drainage area is a cumulative 
function, the sum of the areas of all subbasins upstream 
of a point, which has hydrological and geomorphological 
implications [56, 95-97]. The constraint of conservation 
of the total area [56, 98] suggests that the distribution 
of areas sampled along a given straight line or along a 
curve where multiple aggregates occur (i.e. $p_b(s|L_{\parallel})$ and 
$p_{ms}(s|L_{\parallel})$ in [63], respectively) differs from the Korcak's 
law [58] for the drainage area $p(s|L_{\parallel})$ in the scaling 
exponent. This is supported by empirical evidence for 
river basins [63]. Indeed, if at i sites one collects the 
areas $S_i$ and must enforce the constraint $\sum_i S_i = S_{max}$ 
(where $S_{max}$ is the total area), the resulting population 
refers to a different random variable from the one underlying 
the exceedence of the drainage area, because the analogous 
areas $S_i$ sampled anywhere do not add up to the total area 
[56, 63]. Hence, this is true for any other system. 

The Hack's exponent is in fact different for the 
three distributions mentioned above: $h = 1 - e = 2 - \tau$ 
for the drainage area; $h = e = \tau_{ms} - 1$ for the areas along 
a curve; and $h = 2 - e = \tau_b - 1$ for the areas along a 
straight line. $\tau$, $\tau_{ms}$, and $\tau_b$ are defined in [63] as the 
scaling exponents of the probability density function 
of these areas that occur along a curve (self-affine case) 
or a straight line (self-similar case) (i.e., 1 + df/(1 + H) 
and 2 - 1/(1 + H), respectively). In order to obtain the 
desired distribution of areas the correct e needs to be 
introduced in Equation 4. 



2.7. Validation of the Theoretical Construct 

Because of the validity of Equation 1, Equation 2, and in 
analogy with the analytical framework of river basins' 
drainage area [56], we assume that the slope of the 
probability of exceedence of the aggregate-size is -e = 
h - 1 [56]. The model for the aggregate-size 
distribution is validated by comparing: 

1. the Hack's exponent derived from the PAR 
(h) versus: (i) $h_K$ = 1 - e = df/(1 + H) from the 
Korcak's law for the river basin drainage area 
(we consider self-affine basins); (ii) $h_K$ = e for the 
aggregates of all systems in the self-affine case; 
and (iii) $h_K$ = 2 - e for the exact self-similar 
case of aggregates (for which H = 1), that is the 
case of bacteria aggregates. $h_K$ is calculated only 
from the aggregate-size spectrum; 

2. the Hurst exponent H derived from the scaling 
relationship $L_{\perp} \sim L_{\parallel}^{H}$, versus $H_c$ = df/h - 1 
derived from the PAR by assuming an average 
value of df = 1.1. H is determined only by calculation 
of the diameters of the aggregates. 

The first validation is to test the relationship between 
the aggregate-size distribution and the perimeter-area 
relationship, while the second validation is to test the 
relationship between the perimeter-area relationship 
and the allometry relationship of aggregates consider- 
ing their diameters. 
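
The two comparisons above can be sketched in a few lines of Python. This is only an illustrative helper, not the procedure used in the study: the function names, the case labels, and the numbers in the usage lines are assumptions introduced here, while the formulas are the ones listed in the two validation items.

```python
def hack_from_korcak(e, case):
    """Hack-type exponent h_K implied by the Korcak exponent e."""
    if case == "drainage":       # (i) river-basin drainage area, self-affine basins
        return 1.0 - e
    if case == "self_affine":    # (ii) aggregates in the self-affine case, H < 1
        return e
    if case == "self_similar":   # (iii) aggregates in the self-similar case, H = 1
        return 2.0 - e
    raise ValueError(f"unknown case: {case!r}")


def hurst_from_par(h, df=1.1):
    """Hurst exponent H_c = df/h - 1 implied by the PAR exponent h."""
    return df / h - 1.0


# Hypothetical values, chosen only to show how the comparison is read.
h_measured, e_measured, H_measured = 0.57, 0.43, 0.93
print("h from PAR:", h_measured,
      "| h_K from Korcak:", hack_from_korcak(e_measured, "drainage"))
print("H from diameters:", H_measured,
      "| H_c from PAR:", round(hurst_from_par(h_measured), 3))
```

The case label has to be chosen by the user according to how the areas were sampled, since the same value of e maps to a different Hack-type exponent in each case.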

2.8. Maximum Likelihood Estimation of the 
aggregate-size Spectrum 

The Maximum Likelihood Estimation (MLE) method 
here employed was developed in [99] for the selection 
of the best-fit probability distribution function on 
data. In this study, power-law (Pareto), the proposed 
truncated power-law (truncated Pareto-Levy) (Equation 
4), and exponential distributions are tested for the 
random variable aggregate-size S. These distributions 
are tested on the aggregate-size calculated by the 
box-counting and the models explained in Section 2.4. 
The appropriate MLE equation for each distribution 
is used to derive an exponent with an initial $s_{min}$ 
parameter set to the minimum value found in the data 
and model predictions. A best fit dataset is generated 
with the estimated parameter and a Kolmogorov- 
Smirnov (KS) test is used to determine the goodness of 
fit (the KS-D statistic). The KS test is the accepted test 
for measuring differences between continuous data sets 
(unbinned data distributions) that are a function of a 
single variable. 

This difference measure, the KS-D statistic, is defined 
as the maximum value of the absolute difference 
between two cumulative distribution functions. We 
consider the KS-D statistics in a [0, 1] range. The KS-D 
statistic between two different cumulative distribution 
functions $P_{N_1}(s)$ and $P_{N_2}(s)$ is defined by 
$KS\text{-}D = \max_{-\infty < s < \infty} |P_{N_1}(s) - P_{N_2}(s)|$. To determine the best fit 
value for the $s_{min}$ parameter the calculation is repeated 
with increasing values for $s_{min}$ taken from the dataset, 
with the value that resulted in the best (lowest) KS-D 
statistic being retained as the best fit value [99, 
100]. When fitting a Pareto distribution the method is 
repeated to derive a best fit value for the $s_{max}$ parameter, 
so for the Pareto distribution both the $s_{min}$ and $s_{max}$ 
parameters are fitted in the same way. This method is 
applied for any scaling regime of the data. The slopes of 
the exceedence probability are derived using the linear 
and the jackknife models [72]. The MLE method is used 
to verify that the proposed Pareto-Levy distribution 
(commonly called "double-Pareto") has the best fit for 
the observed and predicted aggregate-size spectra. 
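
As a rough illustration of the $s_{min}$ scan described above, the following sketch fits a single-regime Pareto tail by maximum likelihood and keeps the lower cutoff with the lowest KS distance. It is a minimal sketch under stated assumptions, not the code of [99]: the function names, the `min_tail` guard, and the synthetic sample are introduced here for illustration, the KS distance is computed against the fitted analytic CDF rather than a generated best-fit dataset, and the truncated and double-Pareto cases are not covered.

```python
import numpy as np


def pareto_exponent_mle(tail, s_min):
    """MLE of the exceedence exponent e for P(>= s) = (s / s_min)**(-e)."""
    return tail.size / np.sum(np.log(tail / s_min))


def ks_distance(tail, s_min, e):
    """Largest absolute gap between empirical and fitted CDFs (tail sorted)."""
    n = tail.size
    model_cdf = 1.0 - (tail / s_min) ** (-e)
    step_hi = np.arange(1, n + 1) / n   # empirical CDF just after each point
    step_lo = np.arange(0, n) / n       # empirical CDF just before each point
    return max(np.max(np.abs(step_hi - model_cdf)),
               np.max(np.abs(model_cdf - step_lo)))


def scan_s_min(sizes, min_tail=50):
    """Try each observed size as s_min; keep the fit with the lowest KS distance."""
    x = np.sort(np.asarray(sizes, dtype=float))
    best = None
    for i in range(x.size - min_tail + 1):
        s_min, tail = x[i], x[i:]
        if tail[-1] <= s_min:           # all values equal: exponent undefined
            continue
        e = pareto_exponent_mle(tail, s_min)
        d = ks_distance(tail, s_min, e)
        if best is None or d < best[2]:
            best = (s_min, e, d)
    return best  # (s_min, e, KS distance), or None if nothing could be fitted


# Synthetic sample with exceedence exponent 0.8 (illustrative, not study data).
rng = np.random.default_rng(0)
sample = 2.0 * (1.0 - rng.random(2000)) ** (-1.0 / 0.8)
print(scan_s_min(sample, min_tail=200))
```

The exponent returned here is the slope of the exceedence probability; the corresponding exponent of the probability density function is e + 1.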

3. Results 

The Korcak's law for the animate and inanimate 
systems considered is shown in Figure 3 from plot 
(a) to plot (f) in order of their average aggregate-size 
that is proportional to the average body-mass of system 
components. The aggregate-size is calculated by models 
at different levels of complexity, and by image analysis 
methods as described in Section 2.4. The aggregate-size 
spectrum is tested against the predictions of the 
box-counting method. In the plots of Figure 3 the box- 
counting relationship is reported with grey squares 
fitted by a linear regression. The spectra from the 
box-counting are adjusted by dividing by two the 
scaling exponent in order to be compared to the 
Korcak's law spectra that provide the distributions 
of the aggregate-size with an exponent that is half 
of the fractal dimension. The proposed Pareto-Levy 
distribution (Equation 4) has the best fit for the 
observed and predicted aggregate-size spectra with 
respect to the other distributions considered by the 
MLE method (Section 2.8). The KS-D statistic is always 
higher than 0.87 for this distribution for all the systems 
considered. 

The E. coli bacteria and the Snowy Plover are 
the systems that exhibit a pure power-law (Pareto 
distribution) of the aggregate-size for two and three 
orders of magnitude of the aggregate-size respectively. 
Fishes and big trees of the MMRS exhibit a truncated 
power-law distribution of the aggregate-size with finite- 
size effects ("soft truncation"). The soft truncation is a 
well-known feature of power-law distributions due to 
finite-size effects (see [56]). Landslides and the African 
elephant are the only systems that show a truncated 
double-Pareto distribution with "hard truncation". The 
hard truncation separates the two scaling regimes of 
the aggregate-size distribution (Equation 4). In 
contrast to [91] and [92], we are able to reproduce 
the double-Pareto distribution also for the exceedence 
probability distribution of landslides. The transition 
value, from one scaling regime to another with 
a different exponent of the power-law distribution, is a 
characteristic value that can be related to the system 
domain or to biological constraints [26, 101-104]. 
Double-Pareto size spectra were reported, for example, 
for forest fires, for which the two scaling regimes were 
attributed to the two-layer structure of the forest, which 
allows the formation of different kinds of fires [105]. 
However, man-made constraints can exist and influence 
the spatial distributions of system components, such 
as fences of the KNP for the elephants [10], and the 
Petri dish domain for the E. coli [71]. The influence of 
strong geometrical constraints on species organization 
is a very important topic to investigate with process- 
based models; however, it is outside the scope of this 
paper. 

In the following we discuss some possible 
origins of the double-Pareto distribution of the 
aggregate-size for elephants and landslides in the light 
of our previous comment and of our knowledge 
of these systems. For elephants, the social lives of 
males (bulls) and females are very 
different [106, 107]. The females spend their entire lives 
in tightly knit family groups. These groups are led by 
the eldest female, or matriarch. Adult males, on the 
other hand, live a mostly solitary life [106]. The spatial 
distribution of male elephants is more homogeneous 
than that of females, and this leads to the single power-law 
regime of the aggregate-size for male elephant 
aggregates (black spectrum in Figure 3, e). However, 
some of the eldest females are also observed to be solitary, 
especially at the very end of their life. Hence, the 
aggregate-size distribution of female elephants shows 
a double power-law regime (orange spectrum in Figure 
3, e). Thus, the different power-law structure of the 
aggregate-size for female and male elephants may be 
explained by their different social life. This in turn 
affects the dispersal network and the aggregate-size 
distribution. For the elephants the variation of the scaling 
exponents is estimated at ±0.005 for a variation of ±10 km of the box 
length of the box-counting that is used to calculate the 
aggregates. 

For landslides the origin of the double-Pareto 
distribution has been a matter of debate among 
geomorphologists. On average small landslides tend to 
occur much closer to the river network, in sites with 
small hillslope-to-channel distance. In contrast, 
large landslides tend to involve big portions of 
the hillslopes, and their center of mass is further 
from the network. The fact that the center of mass 
of most landslides is observed further 
up on the hillslopes may simply be due to 
geomorphological reasons, as evidenced in [108]. The 
constraint would be dictated by the dimension of the 
valley, which is expressed by the subbasin ridge-to-channel 
distance. The double scaling of the landslide-size 
is also attributed to different triggering mechanisms: 
for example, seismic-induced landslides are big and 
slow phenomena, while storm-induced landslides are 
small and rapid phenomena. It is also probable that 
the double-Pareto distribution of the landslide-size is 
observable because at smaller scales the cohesion forces 
become prevalent, thus hindering the development 
of more frequent mass movements, while at larger 
scales the main resisting mechanical forces are of 
frictional type only [108]. The double scaling has also 
been attributed to undersampling of small landslide 
events, which are difficult to record. Undersampling 
of small landslides certainly occurs. 
Nonetheless, independently of any 
undersampling it was shown that below a given scale 
the frequency distribution of the landslide-size has 
a roll-over effect that changes the sign of the first 
derivative of the Pareto distribution [108]. We believe 
that, despite all these suppositions about the origin 
of the double-Pareto distribution of the landslide-size, 
the effect of the river network is certainly driving the 
distribution of landslides. 

The collapse test [109] that verifies the ansatz 
(Equation 1) is shown in Figure 4 (a). The product 
$P(> s)\, s^{e}$ has a different constant for each system 
considered. Thus, we rescale everything 
to the same constant for better visualization. Two 
theoretical predictions are validated, namely: the 
perimeter-area relationship (Equation 2); and the 
probability distribution of the aggregate-size (Equation 
4), which is shown to follow a double power-law structure 
(Figure 3). The collapse test verifies our assumption 
that the perimeter-size relationship is a more broadly 
defined Hack's law, and that the power-law distribution 
of drainage areas is a special case of the Korcak's 
law for river basins. Values of H from the PAR 
match the values from direct observations. It is safe to 
assume, in this context, that df ≈ 1. In all cases of the 
systems analyzed the theoretical prediction is verified 
quite well. Overall, the theoretical framework seems 
consistently verified. We plot the normalized scaling 
perimeter-area relationship (PAR), $C/C_{max} \sim (S/S_{max})^{h}$ 
(Figure 4 (b)), because of the large range of perimeters, 
from a few centimeters of the E. coli bacteria aggregates 
to the large perimeters of big-tree aggregates of the 
MMRS. 
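
One simple way to read the Hack's exponent h off the normalized PAR is an ordinary least-squares fit in log-log space. The sketch below is an assumption about how such a fit could be done, not the estimator used in the study (which relies on the linear and jackknife models of [72]); the function name and the synthetic data are hypothetical.

```python
import numpy as np


def hack_exponent_from_par(areas, perimeters):
    """Slope h of log(C/C_max) versus log(S/S_max) by least squares."""
    S = np.asarray(areas, dtype=float)
    C = np.asarray(perimeters, dtype=float)
    x = np.log(S / S.max())
    y = np.log(C / C.max())
    h, intercept = np.polyfit(x, y, 1)   # polyfit returns [slope, intercept]
    return h, intercept


# Synthetic aggregates obeying C = 3 * S**0.6 exactly (illustrative values).
S = np.array([1.0, 4.0, 9.0, 25.0, 100.0])
C = 3.0 * S ** 0.6
h, intercept = hack_exponent_from_par(S, C)
print(h, intercept)
```

Normalizing by $S_{max}$ and $C_{max}$ only shifts the intercept, so the fitted slope is the same as for the unnormalized relationship; the normalization matters for plotting systems of very different size on the same axes.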

The first test (Section 2.7), of the Hack's exponent 
derived from the PAR (herein h) versus that derived from the 
Korcak's law, is verified. Table 1 reports the numerical 
values. Thus, we relate the aggregate-size spectrum 
with the perimeter-area relationship of the aggregates, 
while previous studies (for instance [110] and [60]) did 
not find any linkage between the scaling exponents of 
the two relationships. The second test ($H_c$ = H) appears 
to be less stringent than the first. It is verified only 
for the range of variability of the Hack's exponent 
0.5 < h < 1, that is the range commonly observed for 
river basins [89]. For h < 0.5 and h > 1 the Hurst 
exponent seems well approximated by $H_c$ = 1 - |df/h - 1|. 
For h > 1 the edge effect of aggregates is very 
high, which means that the species are confined in 
very irregular aggregates. It was demonstrated that the 
larger the edge effect determined by the complexity 
of the aggregate perimeter, the lower the probability 
of survival for the individuals of the species within 
the aggregate. This is also the case observed for 
big landslides, for fishes (supposedly because of the 
dendritic structure of the river network, which creates very 
irregular aggregates), and for E. coli colonies. For h < 
0.5 the compactness of habitat aggregates is very high. 
For example, this is the case observed for the solitary 
male elephants in the KNP. For 2 < e + 1 < 3, that is 
the case of big landslides and elephant aggregates, 
a finite mean and infinite variance of the aggregate-size 
is observed. The general case observed for all the 
other systems satisfies e + 1 < 2, for which the mean and 
the variance of the aggregate distribution are infinite. 
This may lead to the conclusion that the aggregate-size 
may theoretically increase without an upper limit. 
However, it is arguable to speculate about the 
mean and variance of the aggregate-size to this extent, 
because theoretical studies are required to verify these 
conclusions. 
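
The thresholds on e + 1 quoted above follow from the moments of a pure power-law density. A short worked derivation is given here as a sketch, assuming an untruncated Pareto density with lower bound $s_{min}$ and density exponent e + 1 (the symbol e denotes the exceedence exponent of the text, not Euler's number):

```latex
% Assumption: pure (untruncated) Pareto density with lower bound s_min.
p(s) = e \, s_{min}^{e} \, s^{-(e+1)}, \qquad s \ge s_{min}

\langle s^{k} \rangle = \int_{s_{min}}^{\infty} s^{k} \, p(s) \, ds
  = \frac{e \, s_{min}^{k}}{e - k} \quad \text{for } e > k,
  \qquad \langle s^{k} \rangle = \infty \quad \text{for } e \le k

% k = 1: the mean is finite only if e > 1, that is e + 1 > 2
% k = 2: the second moment, hence the variance, is finite only if e > 2, that is e + 1 > 3
```

With any finite upper cutoff, as in the truncated distributions considered here, all moments of the sampled aggregate-size are finite; the divergence refers only to the idealized untruncated law.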

For the systems analyzed, the Korcak's exponent 
exhibits a wider range of values than reported in the 
literature [110]. We find that the values of the scaling 
exponent, e + 1, are consistent with the range provided 
by [48]. For large elephant herds and big landslides 
(that is for $s > 7.0 \times 10^{3}$ and $s > 1.0 \times 10^{3}$, which 
correspond to the hard truncation points in Figure 3 
(e) and (f) respectively) we find a fractal dimension 
bigger than three. We attribute this singularity to the 
disproportionate increase of the aggregate length when 
the unit of measurement (e.g. the box-length of the 
box-counting method) is decreased. Very elongated 
aggregates for both elephants and landslides are 
observed. In general, small aggregates tend to be self-similar, 
while big aggregates tend to be self-affine. 
The self-affinity (elongation) of aggregates can also be 
enhanced by geomorphic elements of ecosystems, such 
as the presence of river networks. This is the case, 
for example, of subbasins and fish aggregates. River 
networks play a determinant role also in shaping the 
distribution of riparian trees [111], and the distribution 
of elephants [112]. Both trees and elephants are in fact 
water-dependent species and the closeness to water is 
a fundamental driver of their organization. We find 
different "fractal domains" [113] (or scaling regimes) 
separated by "hard" truncation points of the aggregate-size 
spectrum. These regimes possibly identify different 
dynamics of species organization resulting in different 
aggregate patterns as suggested by [101]. With the hard 
truncation the "heavy-tailedness" of the aggregate-size 
spectrum is less strong than for distributions 
with finite-size effects. Generally there is a lack of a 
characteristic aggregate-size in the presence of a power-law 
distribution of the aggregate-size. However, every 
population of species (or system components) is finite 
because it is constrained by landscape heterogeneities, 
anthropic constraints, and/or biological factors. We 
believe that these factors together control the minimum 
and the maximum aggregate-size and thus the distribution 
of the aggregates [114]. From the theoretical viewpoint 
the distribution is scale-free. However, due to the 
finite size of the population of aggregates we believe 
it is possible to assign a characteristic scale. We 
think this is particularly true in the case of "hard" 
truncated power-law distributions, which are in between 
heavy-tail and log-normal distributions. We underline 
the importance of a further understanding of the 
distribution of aggregates for the understanding of 
system self-organization. 

4. Discussion 

The study shows that the box-counting provides 
reliable estimates of the fractal dimension of aggregates 
using only occurrences of system components. The 
box-counting does not capture finite-size effects but it 
captures hard-truncation points of the aggregate-size 
spectrum. This is important because different scaling 
regimes, that are possibly associated with different 
system dynamics, can be captured by using the box- 
counting method. Because of the validity of the box- 
counting that assumes scale-invariance of aggregates, 
we demonstrate that power-law distributions of the 
aggregate-size imply fractal patterns as found by other 
studies [103]. However, the contrary does not hold; 
scale-invariant patterns are not necessarily realizations 
of systems with power-law aggregate-size distributions. 
We demonstrate that the box-counting, the perimeter-area 
relationship, and the Korcak's law provide close 
estimates of the same fractal dimension. Models of 
higher complexity provide the smallest estimate of the 
fractal dimension based on the Korcak's law ($D_K$) just 
using aggregate sizes, while the box-counting provides 
the largest estimate ($D_b$). The fractal dimension 
calculated using the PAR ($D_c$) is in between $D_K$ and 
$D_b$. Hence, the perimeter-area relationship is possibly 
the best estimate of the fractal dimension because it 
considers perimeters and areas of aggregates; the 
fractal estimation from the PAR is thus based on 
richer information than the fractal dimensions of the 
aforementioned methods. 
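
For readers unfamiliar with the technique, a generic box-counting estimate of the fractal dimension from point occurrences can be sketched as follows. This is the textbook version (count occupied boxes at several box lengths and fit the log-log slope), given under the assumption of two-dimensional point coordinates; it is not the specific implementation used in the study, and the function names and test patterns are hypothetical.

```python
import numpy as np


def box_count(points, box_length):
    """Number of grid boxes of side box_length that contain at least one point."""
    pts = np.asarray(points, dtype=float)
    origin = pts.min(axis=0)                       # anchor the grid at the data corner
    idx = np.floor((pts - origin) / box_length).astype(np.int64)
    return np.unique(idx, axis=0).shape[0]


def box_counting_dimension(points, box_lengths):
    """Minus the slope of log N(r) versus log r over the given box lengths."""
    counts = np.array([box_count(points, r) for r in box_lengths])
    slope, _ = np.polyfit(np.log(box_lengths), np.log(counts), 1)
    return -slope


# Two sanity checks on synthetic patterns: a straight line and a filled square.
t = np.linspace(0.0, 1.0, 20000, endpoint=False)
line = np.column_stack([t, t])
square = np.random.default_rng(0).random((20000, 2))
print(box_counting_dimension(line, [1 / 4, 1 / 8, 1 / 16, 1 / 32]))
print(box_counting_dimension(square, [1 / 2, 1 / 4, 1 / 8]))
```

The estimate depends on the range of box lengths: boxes smaller than the typical spacing between occurrences count individual points rather than the pattern, so the fitted range should stay above that scale.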

We verify that aggregates can be considered as islands 
and their perimeter as a curve mirroring the sculpting 
network in the landscape. We show that the probability 
of exceedence of the drainage area and the Hack's 
law are the Korcak's law and the perimeter-area 
relationship (PAR) for river basins, respectively. We 
formulate a probabilistic characterization for animate 
and inanimate systems, extending the fractal theory 
of river basins to aggregates of any animate and 
inanimate system. At the system scale, aggregates of 
system components, from bacteria to elephants, are the 
byproduct of dispersal networks of single individuals, 
just as river basins and landslides are the 
byproduct of river networks. The Korcak's law, that 
is the aggregate-size spectrum, is verified also for the 
cumulative drainage area and for the areas of merging 
subbasins in river basins sampled along a self-similar 
or a self-affine mainstream. By analogy, mainstreams 
are for subbasins like Brownian-Levy paths of species 
that disperse in ecosystems. The ansatz (Equation 1) 
is verified by comparing the Hack's exponent, h, from 
the perimeter-area relationship and its estimate derived 
from the Korcak's law. The Hurst exponent, H, from the 
PAR, is tested against the exponent derived from the 
allometric relationships between the aggregates' diameters. 
This test is not verified for h > 1, supposedly because 
the edge effect is very high, and for h < 0.5, because the 
compactness of aggregates is very high. Both situations 
are not observed in river basins. 
A novel analytical formulation is provided for 
the probability distribution of the aggregate-size. 
The analytical formulation is a generalized Korcak's 
law that describes the double-Pareto and Pareto 
distributions with finite-size and truncation effects. 
Double "fractal regimes" evidenced by the double- 
Pareto distribution are possibly signatures of different 
system dynamics, as observed for landslides 
and elephants. The finite-size effects and the hard 
truncation in the aggregate-size spectrum are caused 
by geometrical constraints of the ecosystem (for 
instance, the maximum extent of the ecosystem that 
determines an upper limit to the growth of aggregates) 
or biological constraints. The power-law distribution 
of the aggregate-size can be a manifestation of the 
self-organization of species along a network, such as 
the case of river basins. For the same double-Pareto 
distribution of the aggregate-size a virtually infinite 
number of spatial arrangements of aggregates can 
be generated, likely with different fractal dimensions. 
Thus, future research is anticipated toward the 
understanding of the linkage between the aggregate-size 
spectrum and the spatial distribution of aggregates, 
which has relevant consequences for metapopulation 
dynamics of species, hydrogeomorphological dynamics, 
and epidemic spreading, to name just a few examples. 

5. Conclusions 

The characterization of system patterns is crucial as 
a first step to possibly understand the fundamental 
drivers of system processes, and to develop indicators 
that are capable of predicting fluctuations of these 
patterns. Here we focus on aggregation features 
of natural animate and inanimate systems and in 
particular on those that are characterized by a power-law 
distribution of the aggregate-size. The power-law 
manifests a resilient configuration of the system 
[26]. Aggregation phenomena are also observed in 
human systems (for example, cities) and analogies have 
been drawn between natural and human systems by 
recent studies [16]. We propose the box-counting as 
a parsimonious null-method for accurately estimating 
the aggregate-size distribution without the knowledge 
of any detailed information, other than system 
component occurrences, about the systems analyzed. 
For example we did not use any biological information 
of the species investigated. The box-counting can 
be tested against other more "biologically-complex" 
models which provide other complexity measures, such 
as area, perimeter, and diameters of aggregates. This 
validation, at least for the cases analyzed, confirms 
that the occurrences of system components or just 
the occurrences of aggregates, if available, are enough 
for the box-counting to predict the aggregate-size 
distribution reliably. 

The introduced analytical formulation of the 
aggregate-size distribution can model different Pareto 
distributions, such as double-Pareto, and Pareto with 
soft and hard truncations by properly adjusting the 
distribution parameters. The box-counting does not 
reproduce the tail of the aggregate-size distribution 
in the presence of finite-size effects. This range of the 
distribution is very narrow and few system components 
experience such a level of aggregation. However, these 
system components are the largest in size; hence, they 
may be vital for the whole system 
(for instance when they are the hub of system function). 

Our results show that position, and topological 
features of any aggregate are determined by global 
system processes governed by a physical network, 
a process network or both. The fractal dimension 
of each aggregate is an estimate of the fractal 
dimension of the whole pattern of aggregates because 
aggregates are tightly linked. We believe that the 
development of detailed process-based models which 
result in power-law distributions of the aggregate- 
size is certainly necessary to verify these conclusions 
and to test how and which conditions change the 
aggregate-size distribution from power-law to another 
type of distribution. However, that is not sufficient 
if computational and theoretical methodologies for 
characterizing aggregation patterns, such as the ones 
provided here, are not available. The aggregate-size 
spectrum is in fact an important indicator of system 
form and function. For this reason, methods 
that capture such organization (for example just by 
assessing the fractal dimension) and its variation due 
to endogenous and exogenous changes [26, 103] are 
desired. 

This is also useful for designing animate and 
inanimate man-made systems with a desired degree 
of aggregation of system components, multiple levels 
of aggregation in the same system space dictated 
by different power-law regimes of the aggregate-size 
distribution, or time varying aggregation. 

Table 1. Fractal dimensions, scaling exponents, 
and validation. The systems are listed in order of 
their average aggregate-size which is proportional on 
average to the body-mass of system components. The 
box-counting for subbasins, landslides, and E. coli 
is performed considering the center of mass of the 
aggregates as point-occurrence patterns. $D_b$, $D_K$, $D_c$ 
are the fractal dimensions from the box-counting, the 
Korcak's law (Eq. 4), and the PAR (Eq. 2). A double 
scaling is observed for elephants and landslides. H is 
derived from the ansatz ($L_{\perp} \sim L_{\parallel}^{H}$), h from the PAR. 
$H_c$ and $h_K$ are compared to H and h for validation of 
the theory. $h_K$ is derived from the Korcak's law and it 
is: (i) 1 - e from the Korcak's law for the river basin 
drainage area (we consider self-affine basins); (ii) e 
for the aggregates of all the species in the self-affine 
case (H < 1); and (iii) 2 - e for the self-similar case 
of aggregates (H = 1), that is the case of the E. coli. 
$H_c$ is derived from the PAR and it is: (i) df/h - 1 for 
0.5 < h < 1.0; and (ii) 1 - |df/h - 1| for h < 0.5 and 
h > 1. $\langle L_{\parallel} \rangle$ is the average aggregate diameter, which is a 
characteristic length of the whole mosaic of aggregates; 
$S_{max}$ and $C_{max}$ are the maximum values of the aggregate 
area and perimeter. Variation of scaling exponents is 
estimated at ±0.04. Variabilities of measured exponents 
are standard errors found by a Maximum Likelihood 
Estimation method (Section 2.8), bootstrapping 
over cases and deriving scaling exponents by the linear 
and the jackknife models [72]. 



Figure 1. Schematic representation of the theoretical 
construct. An ideal self-similar or self-affine curve 
is drawn in the system domain where aggregates are 
self-organized. The curve can be the path followed by 
species (for instance, a dispersal network similar to a 
Brownian walk [33, 34]) or a physical network (such 
as a river network). Other curves of the same type 
can be traced within each aggregate. The curve can 
be imagined as the mainstream of a river. Aggregates 
are characterized by allometric relationships such as 
for river basins. S is the aggregate area, l is the length 
of the curve, $L_{\parallel}$ and $L_{\perp}$ are the aggregate diameters. 
The same quantities are evidenced in Figure 2 for river 
basins. Along the curve it is possible to sample the 
aggregate areas sequentially ($S_1$, $S_2$, ...), or to sample 
the sum of aggregate areas ($S_1$, $S_1 + S_2$, ...). This leads 
to two different probability distributions. The centers of 
mass are represented as grey dots. 

Figure 2. Animate and inanimate systems 
considered in the study. From (a) to (f) the species 
are shown in order of the extension of the system 
domain where they occur. (a) E. coli bacteria colonies 
(courtesy of [2]). (b) Tanaro subbasins identified by 
their drainage divides in red [63]. (c) Snowy Plover 
nest occurrences in 2006 along the Florida Gulf coast, 
and closeup of the box-counting applied to the upper 
part of the St. Joseph Peninsula State Park [66]. (d) 
Historical Arno landslides from the Synthetic Aperture 
Radar (SAR) images of the European Remote Sensing 
spacecraft (the center of mass of landslides is reported) 
[67]. (e) Elephant occurrences in 2005 in the Kruger 
National Park [68] (plane survey). (f) 100-th most 
common species of fishes, and of big trees associated 
to each direct tributary area (DTA, ~ 3900 km^2) in the 
Mississippi-Missouri River Basin [9, 42, 55]. 

Figure 3. Probability of exceedence of the 
aggregate-size from model predictions and the box-counting. 
The value of the reported scaling exponents 
e and p is half of the fractal dimension ($D_K/2$, Equation 
4). The probability of exceedence of the aggregate-size 
is the Korcak's law (Equation 4) derived from model 
predictions. The plots from (a) to (f) are in order of 
their average aggregate-size which is proportional on 
average to the body-mass of system components. The 
aggregate-size unit is reported along the x-axis. For 
fishes and big trees of the MMRS the aggregate-size 
is expressed in "local-community" units (LC), where 
a LC unit is the direct tributary area whose average 
extension is 3900 km^2. The binned box-counting 
relationship is reported with grey squares. The 
fractal dimension corresponding to the box-counting 
method is reported in Table 1. The KS-D statistic of the 
double-Pareto distribution on the aggregate-size from 
the box-counting is 0.87, 0.90, 0.96, 0.97, 0.93, 0.92 
with respect to the other distributions (Section 2.8) for 
the systems considered from (a) to (f). 

Figure 4. Collapse test and perimeter-area 
relationship. (a) Intersystems collapse test of the 
scaling ansatz (Equation 1). $P(> s)\, s^{e}$ is rescaled to 
the same constant for all the species. (b) Normalized 
perimeter-area relationship (PAR) (Equation 2). The 
normalized PAR, $C/C_{max} \sim (S/S_{max})^{h}$, provides a direct 
estimation of the Hack's exponent h. 